based on ranks

Jeffrey D. Hart

Texas A&M University 

Abstract: A rank-based test of the null hypothesis that a regressor has no 
effect on a response variable is proposed and analyzed. This test is identical 
in structure to the order selection test but with the raw data replaced by 
ranks. The test is nonparametric in that it is consistent against virtually any 
smooth alternative, and is completely distribution free for all sample sizes. The 
asymptotic distribution of the rank-based order selection statistic is obtained 
and seen to be the same as that of its raw data counterpart. Exact small 
sample critical values of the test statistic are provided as well. It is shown 
that the Pitman-Noether efficiency of the proposed rank test compares very 
favorably with that of the order selection test. In fact, their asymptotic relative 
efficiency is identical to that of the Wilcoxon signed rank and t-tests. An 
example involving microarray data illustrates the usefulness of the rank test 
in practice. 



1. Introduction 

It is arguable that the best and most aesthetically appealing ideas in science are
those that combine the virtues of simplicity and effectiveness. Perhaps no idea in 
statistics better achieves this combination than rank-based methods. In this paper 
such methods are brought to bear on the problem of nonparametrically testing 
lack-of-fit in regression. 

Nonparametric lack-of-fit tests based on smoothing methods have received a 
great deal of attention in recent years. Many of these tests are discussed by Hart 
[3]. An omnipresent problem of smoothing-based tests, as with any other test, is 
uncertainty about the sampling distribution of the test statistic. The bootstrap 
seems to have become the principal way of dealing with this problem when the 
distribution of the data is unknown. However, the bootstrap is not a panacea for at 
least two reasons. First of all, its performance usually breaks down when the data do 
not possess at least two moments, and secondly, even if the moment assumptions are 
met, it only guarantees test validity asymptotically, as the number of data increases
without bound. 

Rank tests have long been an effective way of dealing with uncertainty about 
the data distribution. An important, but not nearly exhaustive, set of references 
on the subject is Chernoff and Savage [1], Puri and Sen [14], Randles and Wolfe
[15], Hettmansperger and McKean [5]. Rank tests are relatively insensitive to
assumptions about the underlying data distribution. Indeed, under quite general
assumptions they are often completely distribution-free, no matter the sample size.
Furthermore, they usually sacrifice little, if any, power relative to analogous tests
based on the raw data. The purpose of this paper is to consider certain
smoothing-based lack-of-fit tests when they are applied to ranks of the responses rather than
the responses themselves. It will be shown that many of the conclusions about rank
tests from other settings carry over to the nonparametric lack-of-fit problem.

Aside from Hart [3], it appears that the work of Lombard [9] is more closely
related to that in the current paper than any other in the existing literature. Lombard
[9] considers the problem of testing a sequence of independent random variables for
mean or scale constancy. He proposes that the periodogram of the ranks of the
data be computed, and defines a test statistic as a weighted sum of centered
periodogram ordinates. The omnibus nature of this statistic and the fact that it uses
Fourier transforms of ranks make it similar in spirit to statistics proposed in the 
current paper. 

Many, and probably most, areas of applied research require good methods for 
testing the fit of parametric regression models. The methodology proposed in this 
paper thus has the potential of being extremely useful in interdisciplinary research. 
Evidence of this potential is provided in Section 4, which involves an example from 
the field of microarray analysis. Our methods are used to establish that test samples 
differ from reference samples in terms of DNA copy number. Such differences are 
known to be correlated with cancer incidence in the test subjects. 

In the next section we will introduce a nonparametric regression model and 
propose some rank-based tests of the null hypothesis that the regression function is 
constant. Asymptotic properties of one of these tests are developed in this section as 
well. In Section 3 we discuss how the basic ideas from Section 2 may be extended to 
test the fit of a linear model. The aforementioned illustration involving microarrays 
is the subject of Section 4, concluding remarks are made in Section 5, and proofs 
of three theorems are given in an Appendix. 

2. Testing the no-effect hypothesis 

Suppose that one observes responses $Y_1, \ldots, Y_n$ at fixed design points $x_1, \ldots, x_n$
which, mainly for convenience, we take to be $x_i = (i - 1/2)/n$, $i = 1, \ldots, n$. In
general it is assumed that

$$Y_i = r(x_i) + \epsilon_i, \quad i = 1, \ldots, n, \tag{2.1}$$

where $r$ is some function that is square integrable over $[0,1]$ and $\epsilon_1, \ldots, \epsilon_n$ are
independent and identically distributed. A fundamentally important problem in
this setting is establishing that the response is indeed related to the design variable.
Formally, we wish to test the null hypothesis

$$H_0: r \equiv C,$$

against the alternative

$$H_1: \int_0^1 (r(x) - \bar r)^2\,dx > 0,$$

where $C$ is an unknown constant and $\bar r = \int_0^1 r(x)\,dx$.

A nonparametric means of estimating $r$ is to use an orthogonal series $\hat r_m$ of the
form

$$\hat r_m(x) = \begin{cases} \hat\phi_0, & m = 0, \\ \hat\phi_0 + 2\sum_{j=1}^{m} \hat\phi_j \cos(\pi j x), & 1 \le m < n, \end{cases} \tag{2.2}$$

where

$$\hat\phi_j = \frac{1}{n}\sum_{i=1}^{n} Y_i \cos(\pi j x_i), \quad j = 0, 1, \ldots, n-1.$$



The integer $m$ is the smoothing parameter of $\hat r_m$. Many tests of $H_0$ vs. $H_1$ based
on Fourier series smoothers have been proposed; see Chapter 7 of Hart [3] for a
description of a number of these. One such test is the so-called order selection (OS)
test of Eubank and Hart [2]. Define

$$T_n = \max_{1 \le m < n} \frac{1}{m}\sum_{j=1}^{m} \frac{2n\hat\phi_j^2}{\hat\sigma^2},$$

where $\hat\sigma^2$ is a consistent estimator of $\sigma^2 = \mathrm{Var}(\epsilon_1)$, assuming that this variance is finite.
The OS test rejects $H_0$ for large values of $T_n$. This test derives its name from the
fact that rejecting $H_0$ if and only if $T_n > \lambda$ is equivalent to rejecting $H_0$ if and only
if the maximizer of

$$M_n(m) = \begin{cases} 0, & m = 0, \\ \sum_{j=1}^{m} 2n\hat\phi_j^2/\hat\sigma^2 - \lambda m, & m = 1, 2, \ldots, n-1, \end{cases} \tag{2.3}$$

with respect to $m$ is greater than 0. When $\lambda = 2$, $M_n$ is precisely the well-known
Mallows' criterion (Mallows [11]) for choosing the order of the smoother $\hat r_m$.

A rank-based analog of any series-based test is easily defined by using ranks
instead of $Y_1, \ldots, Y_n$ in the coefficients $\hat\phi_1, \ldots, \hat\phi_{n-1}$. Let $R(Y_i)$ denote the rank of
$Y_i$ among $Y_1, \ldots, Y_n$, and define

$$U_i = \frac{R(Y_i)}{n+1}, \quad i = 1, \ldots, n,$$

and

$$\tilde\phi_j = \frac{1}{n}\sum_{i=1}^{n} U_i \cos(\pi j x_i), \quad j = 1, \ldots, n-1. \tag{2.4}$$

A rank-based analog of $T_n$ is

$$R_n = \max_{1 \le m < n} \frac{1}{m}\sum_{j=1}^{m} \frac{2n\tilde\phi_j^2}{1/12},$$

which was first proposed in Hart [3]. Here we shall verify properties of $R_n$ that were
only conjectured in Hart [3].
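
As an illustration only (this is a sketch, not code from the paper), the statistic $R_n$ can be computed directly from its definition. The function name `rank_os_statistic` is hypothetical, and the sketch assumes the design points $x_i = (i - 1/2)/n$ and the absence of ties among the responses.

```python
import math

def rank_os_statistic(y):
    """Rank-based order selection statistic R_n (sketch).

    Assumes y[0], ..., y[n-1] are observed at x_i = (i - 1/2)/n and
    that there are no ties, as is the case for continuous errors.
    """
    n = len(y)
    # U_i = R(Y_i) / (n + 1), where R(Y_i) is the rank of Y_i in 1..n
    order = sorted(range(n), key=lambda i: y[i])
    u = [0.0] * n
    for rank, i in enumerate(order, start=1):
        u[i] = rank / (n + 1)
    x = [(i + 0.5) / n for i in range(n)]
    best, running = 0.0, 0.0
    for m in range(1, n):
        # rank-based cosine coefficient of order m, as in (2.4)
        phi = sum(u[i] * math.cos(math.pi * m * x[i]) for i in range(n)) / n
        running += 24.0 * n * phi * phi      # 2 n phi^2 / (1/12)
        best = max(best, running / m)        # maximum over m of the average
    return best
```

Because only the ranks enter the computation, applying any strictly increasing transformation to the responses leaves the value of the statistic unchanged.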

2.1. Null distribution of the test statistic 

Table 1

Tail probabilities for $R_n$ at large sample quantiles. Values for $5 \le n \le 10$ are exact,
while those for n = 15, 20 and 30 are gotten from simulation. The latter values
are accurate to within 0.0005 with 95% confidence

            Large sample quantile
  n       3.221    4.179    6.745    10.850
  5      0.1000   0.0167   0.0000   0.0000
  6      0.1028   0.0417   0.0000   0.0000
  7      0.1040   0.0476   0.0004   0.0000
  8      0.1034   0.0487   0.0022   0.0000
  9      0.1042   0.0482   0.0039   0.0000
 10      0.1030   0.0485   0.0053   0.0000
 15      0.1030   0.0496   0.0078   0.0002
 20      0.1020   0.0496   0.0086   0.0003
 30      0.1016   0.0501   0.0089   0.0006
  ∞      0.10     0.05     0.01     0.001

When the null hypothesis is true, the distribution of $R_n$ is completely independent
of the distribution of $\epsilon_1$. For small values of $n$ one may determine the distribution
of $R_n$ exactly by tabulating the value of $R_n$ for each permutation of the ranks
$1, \ldots, n$. The following theorem provides the asymptotic distribution of $R_n$.

Theorem 2.1. Suppose that model (2.1) holds with $r \equiv$ constant and $\epsilon_1, \ldots, \epsilon_n$
independent and identically distributed as $F$, where $F$ is a continuous cumulative
distribution function (cdf). Then

$$\lim_{n\to\infty} P(R_n \le t) = \exp\left\{-\sum_{j=1}^{\infty} \frac{P(\chi_j^2 > jt)}{j}\right\} \equiv G(t), \tag{2.5}$$

where $\chi_j^2$ has the chi-squared distribution with $j$ degrees of freedom, $j = 1, 2, \ldots$.

The distribution $G$ is precisely the same limiting distribution that $T_n$ has under
appropriate moment conditions (Chapter 7, Hart [3]). The minimum number of
moments that $\epsilon_1$ must possess in order for $T_n$ to have limiting distribution $G$ is
two. This in itself provides a key motivation for use of the rank-based test, since
(2.5) requires no moment conditions whatsoever.

Table 1 gives tail probabilities for $R_n$ at selected large sample percentiles. It
seems clear from this table that use of the large sample percentiles will result in a
conservative test so long as $\alpha \le 0.05$. At $\alpha = 0.10$, use of the large sample quantiles
results in tests that are only slightly liberal.
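
The limiting distribution $G$ is easy to evaluate numerically, which gives a quick check on the large sample quantiles in Table 1. The following is a minimal sketch using only the Python standard library; the helper names `chi2_sf` and `os_limit_cdf`, the truncation rule and the default tolerances are assumptions of this illustration, not part of the paper. The chi-squared tail probability for integer degrees of freedom is obtained from the standard upward recurrence of the regularized incomplete gamma function.

```python
import math

def chi2_sf(x, k):
    """P(chi^2_k > x) for integer k >= 1 and x > 0 (sketch)."""
    z = x / 2.0
    if k % 2 == 1:
        a, q = 0.5, math.erfc(math.sqrt(z))   # one degree of freedom
    else:
        a, q = 1.0, math.exp(-z)              # two degrees of freedom
    # Recurrence Q(a + 1, z) = Q(a, z) + z^a e^{-z} / Gamma(a + 1),
    # with the increment evaluated on the log scale for stability.
    while 2 * a < k:
        q += math.exp(a * math.log(z) - z - math.lgamma(a + 1.0))
        a += 1.0
    return q

def os_limit_cdf(t, max_terms=500, tol=1e-12):
    """G(t) = exp(-sum_{j>=1} P(chi^2_j > j t) / j), series truncated."""
    if t <= 1.0:
        return 0.0   # the series diverges for t <= 1, so G(t) = 0
    total = 0.0
    for j in range(1, max_terms + 1):
        term = chi2_sf(j * t, j) / j
        total += term
        if term < tol:
            break
    return math.exp(-total)
```

For example, `os_limit_cdf(4.179)` is approximately 0.95, in agreement with the last row of Table 1. For $t$ only slightly larger than 1 the series converges slowly, and `max_terms` would need to be increased.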



2.2. Power of rank-based order selection test 

Another attractive feature of many rank tests is that they give up remarkably 
little power in comparison to tests based on the raw data. They can even be more 
powerful than raw-data counterparts when the parent distribution is sufficiently 
heavy-tailed. The reader is referred to Puri and Sen [14] for a review of some of 
these results. 

Before investigating power of the rank-based OS test, we can gain some intuition
by considering the function

$$\mu(x) = 2\sum_{j=1}^{\infty} \phi_j \cos(\pi j x),$$

where

$$\phi_j = \lim_{n\to\infty} E\tilde\phi_j, \quad j = 1, 2, \ldots.$$

It should not be surprising that the rank-based OS test has power comparable to
that of an ordinary OS test applied to model (2.1) with $r = \mu$ and $\mathrm{Var}(\epsilon_1) = 1/12$.
Letting $I(A)$ denote the indicator of event $A$, we have $R(Y_i) = \sum_{k=1}^{n} I(Y_k \le Y_i)$
and hence

$$E\tilde\phi_j = \frac{1}{n(n+1)}\sum_{i=1}^{n}\sum_{k=1}^{n} P(Y_k \le Y_i)\cos(\pi j x_i) = \frac{1}{n(n+1)}\sum_{i=1}^{n}\sum_{k \ne i} H(r(x_i) - r(x_k))\cos(\pi j x_i),$$

where $H$ is the cdf of $\epsilon_1 - \epsilon_2$. It follows that

$$E\tilde\phi_j = \int_0^1 \mu(x)\cos(\pi j x)\,dx + O(n^{-1}),$$

where the $O$ term is bounded uniformly in $j$ and

$$\mu(x) = \int_0^1 H(r(x) - r(u))\,du.$$

When the alternative $r$ is very close to the null, the form of $\mu$ relative to $r$ is
more transparent. If $r(x) = \beta(x)/\sqrt{n}$, then

$$E\tilde\phi_j = \frac{h(0)}{\sqrt{n}}\int_0^1 \beta(x)\cos(\pi j x)\,dx + o(n^{-1/2}),$$

where $h$ is the density corresponding to $H$. For the local alternative $\beta/\sqrt{n}$, one thus
anticipates that the rank-based OS test will have power comparable to the ordinary
OS test against the local alternative $h(0)\beta/\sqrt{n}$ with $\sigma = \sqrt{1/12}$. This indeed turns
out to be the case, as we now show.

The relative power of the rank-based and ordinary OS tests will be investigated
using the notion of Pitman-Noether efficiency. We consider a sequence of local
alternatives to $H_0$ having the form

$$r_n(x) = \frac{\beta(x)}{\sqrt{n}}, \tag{2.6}$$

where $\int_0^1 \beta^2(x)\,dx > 0$. Let $P_n(r, f)$ and $P_n^R(r, f)$ denote the power of the ordinary
and rank-based OS tests, respectively, when the regression function is $r$, the error
density is $f$, the sample size is $n$ and the level of significance is the same for each
test. Under appropriate conditions, each of $P_n(r_n, f)$ and $P_n^R(r_n, f)$ has a limit
greater than 0 and less than 1 as $n \to \infty$. If we can find a function of $n$, call it $n^*$,
such that

$$\lim_{n\to\infty} P_n(r_n, f) = \lim_{n\to\infty} P_{n^*}^R(r_n, f),$$

then the Pitman-Noether asymptotic relative efficiency (ARE) of the rank-based
OS test to the ordinary OS test is

$$e(f) = \lim_{n\to\infty} \frac{n}{n^*}.$$

This notion of efficiency relates how many more (or fewer) observations are needed
for a rank-based order selection test to have the same power as the analogous test
based on the raw data.

We now state a theorem concerning the limiting power of the rank-based OS
test.

Theorem 2.2. Suppose that model (2.1) holds with $r(x) = r_n(x)$, as defined by
(2.6). Assume that the following two conditions hold:

• The density $h$ (of $\epsilon_1 - \epsilon_2$) is Lipschitz continuous in a neighborhood of 0.

• The function $\beta$ is Lipschitz continuous on $[0,1]$, except perhaps at finitely
many points where it is simply discontinuous.

Let $Z_1, Z_2, \ldots$ be i.i.d. standard normal random variables, $h$ be the density of $\epsilon_1 - \epsilon_2$,
and $\beta_j = \int_0^1 \beta(x)\cos(\pi j x)\,dx$, $j = 1, 2, \ldots$. Then, for a sequence of level $\alpha$ tests,

$$\lim_{n\to\infty} P_n^R(r_n, f) = P(R > t_\alpha),$$

where $t_\alpha$ is the $1 - \alpha$ quantile of $G$, the limiting null distribution of $R_n$, and

$$R = \max_{m \ge 1} \frac{1}{m}\sum_{j=1}^{m}\left(Z_j + \sqrt{24}\,h(0)\beta_j\right)^2.$$

Under the conditions in Theorem 2.2 and the additional condition that $\epsilon_1$ has
four finite moments, Theorem 7.10, p. 201 of Hart [3] states that

$$\lim_{n\to\infty} P_n(r_n, f) = P\left(\max_{m \ge 1} \frac{1}{m}\sum_{j=1}^{m}\left(Z_j + \frac{\sqrt{2}\,\beta_j}{\sigma}\right)^2 > t_\alpha\right), \tag{2.7}$$

where $\sigma^2 = \mathrm{Var}(\epsilon_1)$.

Combining Theorem 2.2 with (2.7) allows us to obtain the ARE of the rank-based
to the ordinary OS test.

Theorem 2.3. Let the conditions of Theorem 2.2 hold and suppose that $\epsilon_1$ has
four finite moments. Then the ARE of the rank-based OS test to the ordinary OS
test is

$$e(f) = 12\sigma^2\left(\int_{-\infty}^{\infty} f^2(x)\,dx\right)^2. \tag{2.8}$$

Interestingly, the ARE in Theorem 2.3 is precisely the same as those of the
Wilcoxon signed-rank test to the t-test and the Mann-Whitney test to the
two-sample t-test in classical versions of the one- and two-sample location problems,
respectively (Randles and Wolfe [15]). This fact parallels results of Hettmansperger
and McKean [5] in the problem of testing the fit of a linear model. It is shown in
Hettmansperger and McKean [5] (pp. 176-178) that the asymptotic efficiencies of
rank-based tests relative to the classical F-test are the same as corresponding AREs
in the one- and two-sample location problems. In both Hettmansperger and McKean
[5] and our setting, the ARE turns out to be simply the ratio of noncentrality
parameters. In our problem, for example,

$$\mathrm{ARE} = \frac{24h^2(0)\beta_j^2}{2\beta_j^2/\sigma^2} = 12\sigma^2h^2(0).$$

Because $h$ is the density of $\epsilon_1 - \epsilon_2$, we have $h(0) = \int f^2(x)\,dx$, which gives (2.8).

For any scale family $f(x) = f_0(x/\sigma)/\sigma$ such that $\int x^2 f_0(x)\,dx = 1$, we have

$$e(f) = 12\left(\int_{-\infty}^{\infty} f_0^2(x)\,dx\right)^2.$$

A special case of interest is when $f_0$ is the standard normal density, $\phi$. It is
well-known that $e(\phi) = 3/\pi \approx 0.955$, and hence the loss in power using the rank-based
test is quite small.

For long-tailed distributions, the rank test can be more efficient than the ordinary
OS test. Let $t_k$ denote the density of a t distribution with $k$ degrees of freedom,
where $k$ is a positive integer. Then it can be verified that $e(t_k) > 1$ for $5 \le k \le 18$,
with $e(t_5) \approx 1.24$. Of interest are empirical studies to determine how large $n$ must
be in order for such power improvements to be realized. Results of McKean and
Sheather [12] show that rank-based F-tests for testing the fit of linear models are
indeed more powerful in small samples than an ordinary F-test when the error
distribution is heavier tailed than the normal.
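
The efficiencies quoted above can be checked numerically from (2.8). The sketch below is illustrative only: the function names `are_normal` and `are_student_t` are hypothetical, and the closed form used for $\int t_k^2(x)\,dx$ is derived here from the beta integral $\int (1 + x^2/k)^{-(k+1)}\,dx = \sqrt{k\pi}\,\Gamma(k + 1/2)/\Gamma(k+1)$ rather than taken from the paper.

```python
import math

def are_normal():
    """e(phi) = 12 * (integral of phi^2)^2, where the integral is 1/(2 sqrt(pi))."""
    return 12.0 * (1.0 / (2.0 * math.sqrt(math.pi))) ** 2   # equals 3/pi

def are_student_t(k):
    """e(t_k) = 12 * sigma^2 * (integral of t_k^2)^2, with sigma^2 = k/(k-2).

    The formula needs k > 2 for a finite variance; Theorem 2.3 itself
    assumes four finite moments, i.e. k >= 5.
    """
    if k <= 2:
        raise ValueError("the variance of t_k is infinite for k <= 2")
    # log of the normalizing constant of the t_k density
    log_c = (math.lgamma((k + 1) / 2.0) - 0.5 * math.log(k * math.pi)
             - math.lgamma(k / 2.0))
    # log of the integral of the squared density
    log_int = (2.0 * log_c + 0.5 * math.log(k * math.pi)
               + math.lgamma(k + 0.5) - math.lgamma(k + 1.0))
    sigma2 = k / (k - 2.0)
    return 12.0 * sigma2 * math.exp(2.0 * log_int)
```

Evaluating `are_student_t(k)` over a range of $k$ shows the efficiency decreasing toward $3/\pi$ as $k$ grows, crossing 1 between $k = 18$ and $k = 19$.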

Borrowing an idea from the theory of linear rank statistics, one could replace
$U_i$ in (2.4) by a normal score, i.e., $\Phi^{-1}(U_i)$, where $\Phi$ is the standard normal cdf.
Chernoff and Savage [1] show that using normal scores in the classical two-sample 
location problem yields an ARE, relative to the t-test, that is always at least 1. One 
anticipates a similar benefit from using normal scores in the setting of the current 
paper, but we do not pursue this question further here. 



2.3. Other series-based rank tests 



As mentioned previously, a number of lack-of-fit tests based on orthogonal series
have been proposed. Rank-based analogs of any of these can be performed simply
by replacing the raw data by ranks. Perhaps the most fundamental series-based
test is the regression analog of a Neyman smooth test (Neyman [13]). The basis for
such a test is the statistic

$$S_{n,m} = \sum_{j=1}^{m} \frac{2n\hat\phi_j^2}{\hat\sigma^2},$$

where $m$ is fixed prior to data collection. This statistic has a simple and familiar
asymptotic null distribution, i.e., chi-squared with $m$ degrees of freedom. According
to Lehmann [8], it also has a uniformly most powerful property when the data are
normally distributed and the function $r$ has the form

$$r(x) = a_0 + \sum_{j=1}^{m} a_j \cos(\pi j x).$$

A difficulty with $S_{n,m}$ is its dependence on $m$, a poor choice of which could
result in a loss of power or even inconsistency. Ledwina [7] and Kuchibhatla and
Hart [6] proposed that one use a test statistic of the form $S_{n,m}$ with a data-driven
choice for $m$. Here we define a rank-based analog of such a statistic. First define
the Mallows-like criterion $M_n^R$:

$$M_n^R(m) = \begin{cases} 0, & m = 0, \\ \sum_{j=1}^{m} 24n\tilde\phi_j^2 - 2m, & m = 1, \ldots, n-1. \end{cases}$$

If $\tilde m$ is the maximizer of $M_n^R$, then we may define

$$S_n^R = \sum_{j=1}^{\tilde m} 24n\tilde\phi_j^2,$$

and reject $H_0$ for large values of $S_n^R$. This statistic is distribution-free under $H_0$
and its critical values are easily approximated by simulation. One could also use
a BIC-like criterion in place of $M_n^R$ by replacing the term $2m$ by $(\log n)m$. The
criterion, Mallows or BIC, used to choose $m$ in $S_{n,m}$ has a nontrivial effect on
properties of the resulting tests. The reader is referred to Section 7.7.4 of Hart [3]
for a discussion of these properties.
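
A minimal sketch of the data-driven rank statistic follows. It is an illustration under stated assumptions rather than the author's implementation: the names `rank_cosine_coefficients` and `data_driven_rank_statistic` are hypothetical, ties are assumed absent, and the penalty per term is passed as an argument so that the Mallows-like choice (2) and the BIC-like choice ($\log n$) can both be tried.

```python
import math

def rank_cosine_coefficients(y):
    """Rank-based cosine coefficients for j = 1, ..., n-1, as in (2.4)."""
    n = len(y)
    order = sorted(range(n), key=lambda i: y[i])
    u = [0.0] * n
    for rank, i in enumerate(order, start=1):
        u[i] = rank / (n + 1)
    return [sum(u[i] * math.cos(math.pi * j * (i + 0.5) / n) for i in range(n)) / n
            for j in range(1, n)]

def data_driven_rank_statistic(y, penalty=2.0):
    """Return (m_tilde, statistic) for the penalized criterion.

    penalty=2.0 gives the Mallows-like criterion; penalty=math.log(len(y))
    gives the BIC-like variant. The statistic is 0 when m_tilde = 0.
    """
    n = len(y)
    phi = rank_cosine_coefficients(y)
    m_tilde, best, partial, stat = 0, 0.0, 0.0, 0.0
    for m in range(1, n):
        partial += 24.0 * n * phi[m - 1] ** 2   # running sum of 24 n phi_j^2
        crit = partial - penalty * m            # criterion value at order m
        if crit > best:                         # the criterion equals 0 at m = 0
            m_tilde, best, stat = m, crit, partial
    return m_tilde, stat
```

With penalty $\lambda$, the selected order is positive exactly when $R_n > \lambda$, which mirrors the equivalence noted for the OS test and criterion (2.3).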

Hart [4] proposed a Bayesian-motivated lack-of-fit test with very good overall
power properties. A rank-based analog of this statistic is

$$B_n = \sum_{j=1}^{n-1} j^{-2}\exp(12n\tilde\phi_j^2).$$

Under $H_0$, $B_n$ is distribution-free and converges in distribution to the random
variable $\sum_{j=1}^{\infty} j^{-2}\exp(Z_j^2/2)$ as $n \to \infty$, where $Z_1, Z_2, \ldots$ are i.i.d. standard normal.
Under appropriate regularity conditions the ARE of a test based on $B_n$ relative to
that of the analogous raw-data test (as in Hart [4]) is equal to (2.8). This means
that for normal and many other error distributions $B_n$ will inherit all the desirable
power properties discussed in Hart [4].

3. Testing the fit of linear models 

Any of the rank tests previously discussed can be used to test the fit of a linear 
model. Here one wishes to test a hypothesis of the form

$$H_0: r(x) = \sum_{j=1}^{p} \theta_j r_j(x), \tag{3.1}$$

where $r_1, \ldots, r_p$ are known functions and $\theta_1, \ldots, \theta_p$ unknown parameters. Rather
than ranking the observed data, one ranks residuals from the fitted linear model. 
Otherwise the tests are done in precisely the same way as before. Now, one is 
effectively testing the null hypothesis that the expected value of each residual is 0. 
An excellent reference for rank tests based on residuals in a linear models setting 
is Hettmansperger and McKean [5]. 

There is, however, an important difference between testing for constancy of E{Yi) 
and testing the fit of a linear model. In the latter problem the joint distribution of 
the ranks of residuals depends, in general, upon the error distribution. Under the 
null hypothesis (3.1), the residuals have the form

$$\hat\epsilon_i = Y_i - \sum_{j=1}^{p} \hat\theta_j r_j(x_i) = \epsilon_i + \sum_{j=1}^{p} (\theta_j - \hat\theta_j) r_j(x_i), \quad i = 1, \ldots, n.$$

If $\hat\theta_j$ is a $\sqrt{n}$-consistent estimator of $\theta_j$ for each $j$ (as, for example, least squares
estimators generally would be), then the OS rank statistic applied to residuals
will generally have the same limiting distribution as before,
i.e., (2.5). However, in small samples the distribution of that statistic can depend
upon the error distribution, due to the terms $\sum_{j=1}^{p} (\theta_j - \hat\theta_j) r_j(x_i)$, $i = 1, \ldots, n$.
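
To make the residual-based procedure concrete, the following sketch tests the fit of a straight line, i.e., (3.1) with the assumed choices $p = 2$, $r_1(x) = 1$ and $r_2(x) = x$. The function names are hypothetical, least squares is used to estimate the parameters, and, as discussed above, the resulting test is only asymptotically distribution-free.

```python
import math

def rank_os_statistic(values):
    """Rank-based OS statistic of `values` at x_i = (i - 1/2)/n (no ties assumed)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    u = [0.0] * n
    for rank, i in enumerate(order, start=1):
        u[i] = rank / (n + 1)
    best, running = 0.0, 0.0
    for m in range(1, n):
        phi = sum(u[i] * math.cos(math.pi * m * (i + 0.5) / n) for i in range(n)) / n
        running += 24.0 * n * phi * phi
        best = max(best, running / m)
    return best

def residual_rank_os(y):
    """Rank OS statistic applied to residuals from a least squares line."""
    n = len(y)
    x = [(i + 0.5) / n for i in range(n)]
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = ybar - slope * xbar
    residuals = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    return rank_os_statistic(residuals)   # rank the residuals, not the data
```

For a noiseless convex response such as $y_i = \exp(2x_i)$ with $n = 20$, the residuals from the fitted line are U-shaped in $x$ and the statistic far exceeds the large sample 5% point 4.179, so the linear model would be rejected.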

A main competitor of rank methods for dealing with uncertainty about the error 
distribution is the bootstrap. In the linear models context each of these methods 
yields valid tests only in the limit, as $n \to \infty$. An interesting question is which
of the two methods tends to require larger sample sizes for validity. Intuitively it
seems that the rank test would be less sensitive to the underlying error distribution
than is the bootstrap. After all, in the absence of any error in estimating the 
regression coefficients, the rank test would be completely distribution free, whereas 
a bootstrapped OS test (using the raw data) would not be. Nonetheless, we make 
no claims in this regard and postpone a comparison of the bootstrap and rank 
methods to future research. 



4. A data analysis 

Here we apply the rank-based OS test to data collected by the authors of Snijders
et al. [16]. The data are from a microarray experiment that measured genome-wide
DNA copy number. The variable considered here is the ratio of dye intensities for
test and reference samples at a given marker along a chromosome of interest. Each 
intensity is proportional to the number of marker copies. The reference samples are 
diploid, and hence each reference marker has only two copies. It is of interest to 
detect regions on a chromosome where the test samples may have more or fewer 
copy numbers than the corresponding reference samples. Such variations in copy 
number are known to be correlated with cancer incidence; see, for example, Lucito 
et al. [10]. 

Data sets for four different chromosomes (gotten from cell line GM03563) are
shown in Figure 1. In each graph, the horizontal axis is marker location and the
vertical axis is the normalized average of three readings of $\log_2(\rho)$, where $\rho$ is the
aforementioned ratio of intensities (test over reference). Both the ordinary and
rank versions of the OS test were applied to the four data sets to test for constancy
of expected $\log_2(\rho)$. In fact, the hypothesis of interest is that this expectation is
identically 0. The OS tests have no power against a simple shift alternative, and
hence if the null hypothesis of constancy is not rejected, then one would still want
to investigate the (perhaps unlikely) possibility that the true function is identical
to a nonzero constant.

Fig 1. Microarray data comparing DNA copy number in test and reference samples.

Table 2

Test statistics and P-values for rank and ordinary OS tests. The values in columns 3 and 5 are
large-sample P-values for the rank and ordinary OS tests, respectively

Data set          R_n     P-value      T_n     P-value
Chromosome 1     8.44     0.00379    11.05     0.00089
Chromosome 3    47.50     0.00000   306.92     0.00000
Chromosome 4     5.26     0.02465     5.77     0.01795
Chromosome 9    17.44     0.00003    38.99     0.00000

Values of test statistics and large sample P-values are given in Table 2. Since the
sample sizes for chromosomes 1, 3, 4 and 9 are 135, 85, 171 and 109, respectively,
large sample tests seem more than adequate. Both the ordinary and rank-based
OS tests are significant at the 0.05 level of significance for all four data sets,
indicating that there are differences between test and reference samples for all four
chromosomes. An idea of how test and reference differ can be gotten by considering
Fourier series smooths of the form (2.2). It is also interesting to consider the
function estimated by a smooth of this same form but with $\hat\phi_j$ replaced by $\tilde\phi_j$. This
smooth estimates the function $\mu(x)$, which determines the power of the rank-based
test. Estimates of $r(x)/\sigma$ and $\sqrt{12}\,\mu(x)$ are given in Figure 2. In each of the four
plots, the truncation point, $m$, of the estimate $\hat r_m$ is chosen by criterion (2.3) with
$\lambda = 4.18$, and for simplicity the other truncation point is taken to be the same. (In
the case of chromosome 9, two outliers were omitted in computing the smooths.)
It is interesting how well the shapes of the rank-based smooths mimic those of the
raw-data smooths. For these data, the rank-based smooths have somewhat smaller
amplitudes, which is consistent with the rank statistics being smaller than the
ordinary OS statistics. Of course, this need not be the case, since, as pointed out
earlier, the rank tests will sometimes be more powerful.

It is worthwhile mentioning that the rank-based OS tests are essentially impervi- 
ous to the two outliers in the chromosome 9 data, evident in Figure 1. It is thus not 
necessary to delete these cases to determine their effect on the question of whether 
or not the underlying curve is constant. In contrast, in applying the ordinary OS 
test to the whole data set, one would wonder whether significance, or lack thereof, 
was caused by these two points alone. (In this case it turns out that the ordinary 
OS test is highly significant whether or not the two points in question are included.) 

5. Conclusions 

Nonparametric tests of the null hypothesis that a response and a regressor are 
unrelated have been proposed and analyzed. The tests are rank-based versions 
of the order selection test and are completely distribution free under the single 
condition that the regression errors are independent and identically distributed. 
These tests have the same surprisingly good power properties possessed by rank 
tests in simpler testing problems. Rank-based versions of other smoothing-inspired 
lack of fit statistics were also discussed. One such test promises to have better 
overall power properties than the rank-based order selection test. The widespread
nature of regression analysis suggests that these results could have a substantial
impact on a number of different fields.

Fig 2. Fourier series smooths. In each plot, the solid line is a smooth of the raw data, and the
dashed line is a smooth of ranks. The smooths are scaled so as to estimate the functions that
determine the power of the ordinary and rank-based OS tests.

The proposed tests can also be applied to test the fit of linear models. In this 
setting the tests would be applied to residuals, and hence would only be asymptot- 
ically distribution free. The bootstrap is another important tool for approximating 
the sampling distribution of test statistics. An interesting problem for future re- 
search is a comparison of bootstrap and rank-based nonparametric lack-of-fit tests. 
A key question is whether or not large sample rank tests have smaller level error in 
finite samples than do bootstrap tests. 



Appendix 

Here we provide proofs of Theorems 2.1, 2.2 and 2.3. Throughout the proofs, generic
positive (and finite) constants are denoted $C_1, C_2, \ldots$.



A.1. Proof of Theorem 2.1

Define $Z_{jn} = \sqrt{24n}\,\tilde\phi_j$, $j = 1, \ldots, n-1$, in which case

$$R_n = \max_{1 \le j < n} \frac{1}{j}\sum_{i=1}^{j} Z_{in}^2.$$

Now define $V_j = F(\epsilon_j)$, $j = 1, \ldots, n$, and note that $V_1, \ldots, V_n$ is a random sample
from the $U(0,1)$ distribution. Then $Z_{jn}$ may be expressed

$$Z_{jn} = \sqrt{24n}\,(\hat\phi_j + \delta_j),$$

where

$$\hat\phi_j = \frac{1}{n}\sum_{i=1}^{n} V_i\cos(\pi j x_i) \quad\text{and}\quad \delta_j = \frac{1}{n}\sum_{i=1}^{n} (U_i - V_i)\cos(\pi j x_i).$$

Since $F$ is a monotone transformation, $U_i = R(V_i)/(n+1)$, $i = 1, \ldots, n$.
The statistic $R_n$ may be expressed as

$$R_n = \max_{1 \le j < n} \frac{1}{j}\sum_{i=1}^{j} 24n\left(\hat\phi_i^2 + 2\hat\phi_i\delta_i + \delta_i^2\right).$$

At this point we observe that the $U(0,1)$ distribution satisfies the conditions of
Theorem 7.2, pp. 168-169, of Hart [3] and therefore

$$\max_{1 \le j < n} \frac{1}{j}\sum_{i=1}^{j} 24n\hat\phi_i^2 \xrightarrow{D} T, \tag{A.1}$$

where $T$ has distribution $G$, as defined in (2.5). Using the following lemma, the
proof will then be complete if we can show that

$$\max_{1 \le j < n} \frac{1}{j}\sum_{i=1}^{j} n\left|2\hat\phi_i\delta_i + \delta_i^2\right| \xrightarrow{p} 0. \tag{A.2}$$

Lemma A.1. For any real numbers $A_1, B_1, \ldots, A_m, B_m$ and any positive number $\epsilon$,

$$\left|\max_{1 \le j \le m}(A_j + B_j) - \max_{1 \le j \le m} A_j\right| < \epsilon$$

whenever $\max_{1 \le j \le m}|B_j| < \epsilon$.

Proof. Defining $j_1$ and $j_2$ to be such that

$$A_j + B_j \le A_{j_1} + B_{j_1} \quad\text{for } j = 1, \ldots, m,$$

and

$$A_j \le A_{j_2} \quad\text{for } j = 1, \ldots, m,$$

we have

$$\max_{1 \le j \le m}(A_j + B_j) - \max_{1 \le j \le m} A_j = A_{j_1} + B_{j_1} - A_{j_2}.$$

Obviously

$$A_{j_2} + B_{j_2} \le A_{j_1} + B_{j_1} \le A_{j_2} + B_{j_1},$$

and hence

$$B_{j_2} \le A_{j_1} + B_{j_1} - A_{j_2} \le B_{j_1},$$

which proves the result. □

Using Lemma A.1, result (A.1) and Hölder's inequality, (A.2) will be proven if
we can show that

$$t_n \equiv \max_{1 \le j < n} \frac{1}{j}\sum_{i=1}^{j} n\delta_i^2 \xrightarrow{p} 0.$$



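Now, for any $\epsilon > 0$,

$$P(t_n > 2\epsilon) \le \sum_{j=1}^{n-1} P\left(\sum_{i=1}^{j}\delta_i^2 - E_{jn} > \frac{2j\epsilon}{n} - E_{jn}\right),$$

where $E_{jn} = E\sum_{i=1}^{j}\delta_i^2$. As we shall see shortly, $E_{jn} < j\epsilon/n$ for all $j$ and all $n$
sufficiently large, and so

$$P(t_n > 2\epsilon) \le \sum_{j=1}^{n-1} P\left(\left|\sum_{i=1}^{j}\delta_i^2 - E_{jn}\right| > \frac{j\epsilon}{n}\right) \le \sum_{j=1}^{n-1}\mathrm{Var}\left(\sum_{i=1}^{j}\delta_i^2\right)\frac{n^2}{j^2\epsilon^2}. \tag{A.3}$$

Our next step is to determine $E_{jn}$. To this end, define $\Delta_j = V_j - U_j$, $j = 1, \ldots, n$,
and note that

$$E_{jn} = \frac{1}{n^2}\mathrm{Var}(\Delta_1)\sum_{i=1}^{j}\sum_{r=1}^{n}\cos^2(\pi i x_r) + \frac{1}{n^2}\mathrm{Cov}(\Delta_1, \Delta_2)\sum_{i=1}^{j}\sum_{r=1}^{n}\sum_{s \ne r}\cos(\pi i x_r)\cos(\pi i x_s). \tag{A.4}$$

We now need the following fundamental properties of the cosine basis:

$$\sum_{r=1}^{n}\cos(\pi i x_r) = 0, \quad i = 1, \ldots, n-1, \tag{A.5}$$

and

$$\sum_{r=1}^{n}\cos^2(\pi i x_r) = n/2, \quad i = 1, \ldots, n-1. \tag{A.6}$$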
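
Properties (A.5) and (A.6) depend on the particular design $x_r = (r - 1/2)/n$, and they are easily confirmed numerically. The short check below is an illustrative sketch; the function name `cosine_sums` is hypothetical.

```python
import math

def cosine_sums(n):
    """Return the sums in (A.5) and (A.6) for i = 1, ..., n-1."""
    x = [(r - 0.5) / n for r in range(1, n + 1)]     # design points x_r
    plain = [sum(math.cos(math.pi * i * xr) for xr in x) for i in range(1, n)]
    squared = [sum(math.cos(math.pi * i * xr) ** 2 for xr in x) for i in range(1, n)]
    return plain, squared

plain, squared = cosine_sums(12)
# every entry of `plain` is 0 and every entry of `squared` is n/2 = 6,
# up to floating point rounding
```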

Applying these two properties to (A.4) yields

$$E_{jn} = \frac{j}{2n}\left[\mathrm{Var}(\Delta_1) - \mathrm{Cov}(\Delta_1, \Delta_2)\right].$$

Using basic distributional properties of order statistics and ranks, it is
straightforward to show that

$$E_{jn} = \frac{j}{24n^2} + jO(n^{-3}).$$

In particular, $E_{jn} < j\epsilon/n$ for all $j$ once $n$ is sufficiently large. The bound (A.3) thus
tends to 0, and the proof will be complete, if we can show that

$$\mathrm{Var}\left(\sum_{i=1}^{j}\delta_i^2\right) \le C_1\frac{j^2}{n^4}$$

for all $j$ and $n$.
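Now,

$$\sum_{i=1}^{j}\delta_i^2 = \frac{1}{n^2}\sum_{r=1}^{n}\sum_{s=1}^{n}\sum_{i=1}^{j}\Delta_r\Delta_s\cos(\pi i x_r)\cos(\pi i x_s) = \frac{1}{n^2}\sum_{r=1}^{n}\sum_{s=1}^{n}\Delta_r\Delta_s w_{rs},$$

where $w_{rs} = \sum_{i=1}^{j}\cos(\pi i x_r)\cos(\pi i x_s)$ and for ease of notation we suppress the
dependence of $w_{rs}$ on $j$. We have

$$E\left(\sum_{i=1}^{j}\delta_i^2\right)^2 = \frac{1}{n^4}\sum_{r=1}^{n}\sum_{s=1}^{n}\sum_{u=1}^{n}\sum_{v=1}^{n} E(\Delta_r\Delta_s\Delta_u\Delta_v)w_{rs}w_{uv} \equiv \frac{S_n}{n^4},$$

and since we have already established that $E_{jn}^2 \le C_1 j^2/n^4$, we need only consider
$S_n/n^4$. The quantity $S_n$ may be expressed

$$\begin{aligned} S_n = {} & E(\Delta_1\Delta_2\Delta_3\Delta_4)\sum_r\sum_s\sum_u\sum_v w_{rs}w_{uv} \\ & + E(\Delta_1^2\Delta_2\Delta_3)\Big[2\sum_r\sum_s\sum_u w_{rs}w_{uu} + 4\sum_r\sum_s\sum_u w_{rs}w_{ru}\Big] \\ & + E(\Delta_1^2\Delta_2^2)\Big[\sum_r\sum_s w_{rr}w_{ss} + 2\sum_r\sum_s w_{rs}^2\Big] \\ & + 4E(\Delta_1^3\Delta_2)\sum_r\sum_s w_{rr}w_{rs} + E(\Delta_1^4)\sum_r w_{rr}^2, \end{aligned} \tag{A.7}$$

where each sum extends over distinct indices only; for example, $\sum_r\sum_s\sum_u$ is a
summation over ordered triples $(r, s, u)$ such that no two of $r, s, u$ are the same.

Each of the sums in the last expression can be shown to be smaller in absolute
value than $C_3 j^2n^2$. Consider, for example, $\sum_r\sum_s\sum_u\sum_v w_{rs}w_{uv}$, the innermost
sum of which is

$$\sum_v w_{rs}w_{uv} = w_{rs}\sum_{i=1}^{j}\cos(\pi i x_u)\sum_v\cos(\pi i x_v) = -w_{rs}\sum_{i=1}^{j}\cos(\pi i x_u)\left[\cos(\pi i x_r) + \cos(\pi i x_s) + \cos(\pi i x_u)\right],$$

where we have used (A.5). Using in addition (A.6), we have

$$\sum_u\sum_v w_{rs}w_{uv} = w_{rs}\sum_{i=1}^{j}\left[2\cos^2(\pi i x_r) + 2\cos^2(\pi i x_s) + 2\cos(\pi i x_r)\cos(\pi i x_s) - \frac{n}{2}\right].$$

Continuing in this way, we finally see that

$$\sum_r\sum_s\sum_u\sum_v w_{rs}w_{uv} = \frac{j^2n^2}{4} + \frac{jn^2}{2} - 6\sum_{r=1}^{n}\left(\sum_{i=1}^{j}\cos^2(\pi i x_r)\right)^2,$$

which is positive and smaller than $C_4 j^2n^2$ for all $n$ sufficiently large.

We turn now to the expected values in (A.7), each of which is bounded in absolute
value by

$$E(\Delta_1^4) = \frac{1}{n}\sum_{i=1}^{n}\frac{1}{B(i, n-i+1)}\sum_{j=0}^{4}\binom{4}{j}\left(\frac{-i}{n+1}\right)^{j}B(i + 4 - j, n - i + 1),$$

where $B(\cdot,\cdot)$ denotes the beta function. After some straightforward but tedious
calculations, it is seen that $E(\Delta_1^4) = O(n^{-2})$.
Combining this result with previous ones, we have $\mathrm{Var}\left(\sum_{i=1}^{j}\delta_i^2\right) \le C_2 j^2/n^4$, as
required, and the proof is complete.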

A.2. Proof of Theorem 2.2

Let $E_{jn}$ denote $E\tilde\phi_j$, $j = 1, \ldots, n-1$, and define $\delta_j = \tilde\phi_j - E_{jn}$, $j = 1, \ldots, n-1$.
Then

$$\tilde\phi_j = \delta_j + E_{jn}$$

and it suffices to show that

$$\max_{1 \le m < n} \frac{1}{m}\sum_{j=1}^{m}\left(\sqrt{24n}\,\delta_j + \sqrt{24}\,h(0)\beta_j\right)^2 \longrightarrow \sup_{m \ge 1} \frac{1}{m}\sum_{j=1}^{m}\left(Z_j + \sqrt{24}\,h(0)\beta_j\right)^2 \tag{A.8}$$

in distribution as $n \to \infty$, and that

$$\lim_{n\to\infty}\max_{1 \le m < n} \frac{1}{m}\sum_{j=1}^{m}\left(\sqrt{n}E_{jn} - h(0)\beta_j\right)^2 = 0. \tag{A.9}$$

The proof of (A.8) is virtually the same as the proof of Theorem 2.1 and hence
omitted.

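To prove (A.9), first notice that

$$\max_{1 \le m < n} \frac{1}{m}\sum_{j=1}^{m}\left(\sqrt{n}E_{jn} - h(0)\beta_j\right)^2 \le \max_{1 \le m \le \sqrt{n}} \frac{1}{m}\sum_{j=1}^{m}\left(\sqrt{n}E_{jn} - h(0)\beta_j\right)^2 + \frac{1}{\sqrt{n}}\sum_{j=1}^{n-1}\left(\sqrt{n}E_{jn} - h(0)\beta_j\right)^2.$$

We have

$$\sum_{j=1}^{n-1}\left(\sqrt{n}E_{jn} - h(0)\beta_j\right)^2 \le \left[\left(n\sum_{j=1}^{n-1}E_{jn}^2\right)^{1/2} + h(0)\left(\sum_{j=1}^{n-1}\beta_j^2\right)^{1/2}\right]^2. \tag{A.10}$$

Defining

$$\mu_n(x) = \frac{1}{n}\sum_{i=1}^{n} H\left(\frac{\beta(x) - \beta(x_i)}{\sqrt{n}}\right)$$

and $\bar\mu_n = \sum_{i=1}^{n}\mu_n(x_i)/n$, it follows from orthogonality of the cosines that

$$\sum_{j=1}^{n-1}E_{jn}^2 \le \frac{1}{2n}\sum_{i=1}^{n}\left(\mu_n(x_i) - \bar\mu_n\right)^2. \tag{A.11}$$

Now, $\mu_n(x_i) = H(0) + n^{-3/2}\sum_{k=1}^{n}(\beta(x_i) - \beta(x_k))h(\eta_{ikn})$, where $\eta_{ikn}$ is a number
between 0 and $(\beta(x_i) - \beta(x_k))/\sqrt{n}$. It follows that

$$\sqrt{n}\left(\mu_n(x_i) - \bar\mu_n\right) = \frac{1}{n}\sum_{k=1}^{n}(\beta(x_i) - \beta(x_k))h(\eta_{ikn}) - \frac{1}{n^2}\sum_{j=1}^{n}\sum_{k=1}^{n}(\beta(x_j) - \beta(x_k))h(\eta_{jkn}). \tag{A.12}$$

Since $\beta$ and $h$ are both bounded functions, (A.11) and (A.12) imply that
$n\sum_{j=1}^{n-1}E_{jn}^2$ may be bounded by the same constant for all $n$. By the piecewise
continuity of $\beta$ on $[0,1]$, $\sum_{j=1}^{n-1}\beta_j^2$ has a finite limit, which, using (A.10), finishes
the proof that $\sum_{j=1}^{n-1}\left(\sqrt{n}E_{jn} - h(0)\beta_j\right)^2/\sqrt{n}$ tends to 0 as $n \to \infty$.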
Finally, we need to show that

$$\lim_{n\to\infty}\max_{1 \le m \le \sqrt{n}} \frac{1}{m}\sum_{j=1}^{m}\left(\sqrt{n}E_{jn} - h(0)\beta_j\right)^2 = 0. \tag{A.13}$$

We may write $\sqrt{n}E_{jn} - h(0)\beta_j = A_{nj} + B_{nj} + C_{nj}$, where

$$A_{nj} = h(0)\left[\frac{1}{n}\sum_{i=1}^{n}\beta(x_i)\cos(\pi j x_i) - \beta_j\right],$$

$$B_{nj} = \frac{1}{n}\sum_{i=1}^{n}\beta(x_i)\cos(\pi j x_i)\,\frac{1}{n}\sum_{k=1}^{n}\left[h(\eta_{ikn}) - h(0)\right] \quad\text{and}$$

$$C_{nj} = \frac{1}{n^2}\sum_{i=1}^{n}\cos(\pi j x_i)\sum_{k=1}^{n}\beta(x_k)\left[h(0) - h(\eta_{ikn})\right].$$

Using the boundedness of $\beta$ and Lipschitz continuity of $h$ near 0, we have, for all $n$
sufficiently large,

$$|h(\eta_{ikn}) - h(0)| \le \frac{C_1}{\sqrt{n}} \quad\text{for all } i \text{ and } k.$$

This fact along with boundedness of $\beta$ and the cosines implies that $|B_{nj} + C_{nj}| \le
C_2/\sqrt{n}$ for all $n$ sufficiently large and each $j$. Now, using the second of the two
conditions in Theorem 2.2, it is straightforward to show that $|A_{nj}| \le C_3 j/n$ for all
$j$ and $n$.

Combining the results in the previous paragraph, it is clear that

$$\max_{1 \le m \le \sqrt{n}} \frac{1}{m}\sum_{j=1}^{m}\left(\sqrt{n}E_{jn} - h(0)\beta_j\right)^2 \to 0$$

as $n \to \infty$, which completes the proof of Theorem 2.2.



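A.3. Proof of Theorem 2.3

The proof of Theorem 2.3 is simple given result (2.7) and the proof of Theorem 2.2.
Let the alternative function be $\beta(x)/\sqrt{n}$, where $n$ is the sample size of the ordinary
OS test, and let the sample size, $n^*$, for the rank-based OS test be

$$n^* = \left\lfloor \frac{n}{12\sigma^2h^2(0)} \right\rfloor,$$

where $\lfloor z \rfloor$ = greatest integer less than or equal to $z$. If we can show that the limiting
power of the rank-based OS test with sample size $n^*$ is equal to (2.7), then the proof
is complete.

We have

$$24n^*\tilde\phi_j^2 = \left[\sqrt{24n^*}\,\delta_j + \sqrt{24n^*}\left(E_{jn^*} - \frac{h(0)\beta_j}{\sqrt{n}}\right) + \sqrt{\frac{24n^*}{n}}\,h(0)\beta_j\right]^2,$$

where

$$\tilde\phi_j = \frac{1}{n^*}\sum_{i=1}^{n^*} U_i\cos(\pi j x_i), \quad E_{jn^*} = E\tilde\phi_j \quad\text{and}\quad \delta_j = \tilde\phi_j - E_{jn^*}.$$

Since $n^*/n \to 1/(12\sigma^2h^2(0))$, the last term inside the brackets tends to $\sqrt{2}\,\beta_j/\sigma$, in
agreement with (2.7).
The remainder of the proof is virtually the same as that of Theorem 2.2. A key
point is that, to first order, $E\tilde\phi_j$ is still $h(0)\beta_j/\sqrt{n}$, which is true because the factor
$1/\sqrt{n}$ in $h(0)\beta_j/\sqrt{n}$ derives from the local alternative and not the sample size $n^*$.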